Abstract

This paper addresses the general issue of estimating the sensitivity of the expectation 
of a random variable with respect to a parameter characterizing its evolution. In finance 
for example, the sensitivities of the price of a contingent claim are called the Greeks. 
A new way of estimating the Greeks has been recently introduced by Elie, Fermanian 
and Touzi [6] through a randomization of the parameter of interest combined with non 
parametric estimation techniques. This paper studies another type of those estimators 
whose interest is to be closely related to the score function, which is well known to 
be the optimal Greek weight. This estimator relies on the use of two distinct kernel 
functions and the main interest of this paper is to provide its asymptotic properties. 
Under slightly more stringent conditions, its rate of convergence equals that of the estimators 
introduced in [6] and outperforms that of the finite differences estimator. In addition to the 
technical interest of the proofs, this result is very encouraging for the development of 
new types of sensitivity estimators. 



Key words: Sensitivity estimation, Monte Carlo simulation, non-parametric regression.

1 Introduction 

This paper is closely related to the work of Elie, Fermanian and Touzi [6] and we follow 
their notation as far as possible. Let $\lambda$ be some given parameter in $\mathbb{R}^d$, and define the function 

$$ V^\phi(\lambda) := \mathbb{E}\left[\phi(Z(\lambda))\right], $$

where $Z(\cdot)$ is a parameterized random variable with values in $\mathbb{R}^n$ and $\phi : \mathbb{R}^n \to \mathbb{R}$ is a 
measurable function. A well understood issue is the numerical computation of the function 
$V^\phi(\lambda)$, for example by means of a Monte Carlo procedure. A more difficult problem consists 
in approximating the sensitivity of $V^\phi$ with respect to the parameter $\lambda$. For some given 
parameter $\lambda^0$, we denote by $\beta^0$ the quantity of interest defined by 

$$ \beta^0 := \nabla_\lambda V^\phi(\lambda^0) = \nabla_\lambda \mathbb{E}\left[\phi(Z(\lambda))\right]\Big|_{\lambda=\lambda^0}. \tag{1.1} $$

In financial applications, $V^\phi(\lambda)$ is interpreted as the no-arbitrage price of a contingent claim, 
defined by the payoff $\phi(Z(\lambda))$, in the context of a complete market with prices measured in 
terms of the price of the non-risky asset. The sensitivities of $V^\phi$ with respect to the 
parameter $\lambda$ are often called Greeks, and their interest to practitioners is now well established. 



To our knowledge, three main methods are considered for the computation of those sensitivities. 
They are compared in detail in the survey paper of Kohatsu-Higa and Montero 
[10], and we only briefly present here their construction and main properties. 
First, the finite differences method consists in approximating the derivative of the price by 
its variation in response to a small perturbation $\varepsilon$ of the parameter of interest $\lambda$: 

$$ \beta^0 \approx \frac{V^\phi(\lambda^0+\varepsilon) - V^\phi(\lambda^0)}{\varepsilon}. \tag{1.2} $$

Given a number of Monte Carlo simulations for the prices, the choice of $\varepsilon$ results from a 
trade-off between the bias and the variance of the estimator. For discontinuous payoff 
functions $\phi$, this method appears inefficient due to the poor precision of the approximation (1.2). 
A theoretical study of those estimators is reported in L'Ecuyer and Perron [5], Detemple, 
Garcia and Rindisbacher [4] or Milstein and Tretyakov [12]. 
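
As a rough illustration of the bias and variance trade-off behind (1.2), the following sketch applies the finite differences estimator to a toy model that is not taken from the paper: $Z(\lambda) = \lambda + W$ with $W$ standard Gaussian and the digital payoff $\phi(z) = 1_{z>0}$, so that $V^\phi(\lambda)$ is the Gaussian distribution function and $\beta^0$ is the Gaussian density at $\lambda^0$. The function names, the use of common random numbers and all numerical values are assumptions made for the example.

```python
import math
import random


def digital_payoff(z):
    """Discontinuous payoff 1_{z > 0} (toy choice, not from the paper)."""
    return 1.0 if z > 0.0 else 0.0


def finite_difference_sensitivity(lam0, eps, n_sims, seed=0):
    """Forward finite differences estimate of d/d(lambda) E[payoff(lambda + W)].

    The same Gaussian draw W is used for the bumped and unbumped prices
    (common random numbers); this is an assumption of the sketch.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        w = rng.gauss(0.0, 1.0)
        bumped = digital_payoff(lam0 + eps + w)
        base = digital_payoff(lam0 + w)
        total += (bumped - base) / eps
    return total / n_sims


lam0 = 0.0
# In this toy model the exact sensitivity is the standard Gaussian density at lam0.
exact_sensitivity = math.exp(-0.5 * lam0 ** 2) / math.sqrt(2.0 * math.pi)
fd_estimate = finite_difference_sensitivity(lam0, eps=0.1, n_sims=100_000)
print(f"finite differences: {fd_estimate:.4f}, exact: {exact_sensitivity:.4f}")
```

Decreasing `eps` reduces the bias but inflates the variance of each term, which is of order $1/\varepsilon$ for this discontinuous payoff.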

Second, one can interchange the differentiation and the expectation operators to obtain the 
pathwise estimator, given by a Monte Carlo estimation based on the representation 

$$ \beta^0 = \mathbb{E}\left[\phi'(Z(\lambda^0))\,\nabla_\lambda Z(\lambda^0)\right]. $$

This method, introduced by Broadie and Glasserman [3], requires a lot of regularity 
on the payoff function $\phi$ as well as the computation of the tangent process $\nabla_\lambda Z$ of 
the underlying. Efficient numerical schemes for the implementation of this method can be 
found in Giles and Glasserman [8]. 

Finally, one can compute $\beta^0$ by transferring the differentiation operator onto the regular 
distribution of the underlying $Z(\lambda)$. Whenever this random variable admits a density $f(\lambda, \cdot)$ with 
respect to the Lebesgue measure, we obtain the so-called likelihood ratio estimator based 
on 

$$ \beta^0 = \mathbb{E}\left[\phi(Z(\lambda^0))\, s(\lambda^0, Z(\lambda^0))\right], \quad \text{with } s := \frac{\nabla_\lambda f}{f}. \tag{1.3} $$

The application of this trick in finance has also been introduced by Broadie and Glasserman 
[3]. This type of representation has been generalized by Fournie, Lasry, Lebuchoux, Lions 
and Touzi [7], who studied the properties of the random variables $\pi$ satisfying 

$$ \nabla_\lambda V^\phi(\lambda^0) = \mathbb{E}\left[\phi(Z(\lambda^0))\,\pi\right], \quad \text{for any function } \phi \in L^\infty(\mathbb{R}^n, \mathbb{R}). $$

By means of a Malliavin integration-by-parts argument, they characterized the set of the 
so-called Greek weights $\pi$, allowing their (tedious) computation in some particular cases. 
Nevertheless, among all those Greek weight based estimators, the one related to the score function 
$s$ and given by (1.3) leads to the smallest variance. 
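
To make the representation (1.3) concrete, here is a minimal sketch in a toy Gaussian model chosen for the example (it is not the setting of the paper): $Z(\lambda) \sim N(\lambda, 1)$, so that the score is known in closed form, $s(\lambda, z) = z - \lambda$, and the payoff is the digital payoff $1_{z>0}$. The function names and parameter values are hypothetical.

```python
import math
import random


def digital_payoff(z):
    """Discontinuous payoff 1_{z > 0} (toy choice, not from the paper)."""
    return 1.0 if z > 0.0 else 0.0


def gaussian_score(lam, z):
    """Score d/d(lambda) ln f(lambda, z) when Z(lambda) ~ N(lambda, 1)."""
    return z - lam


def likelihood_ratio_sensitivity(lam0, n_sims, seed=0):
    """Monte Carlo average of payoff(Z) * score(lam0, Z), with Z simulated at lam0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        z = lam0 + rng.gauss(0.0, 1.0)
        total += digital_payoff(z) * gaussian_score(lam0, z)
    return total / n_sims


lam0 = 0.0
# Exact value in the toy model: the standard Gaussian density at lam0.
exact_sensitivity = math.exp(-0.5 * lam0 ** 2) / math.sqrt(2.0 * math.pi)
lr_estimate = likelihood_ratio_sensitivity(lam0, n_sims=100_000)
print(f"likelihood ratio: {lr_estimate:.4f}, exact: {exact_sensitivity:.4f}")
```

No regularity of the payoff is used here; the whole difficulty addressed in the paper is that the score is usually not available in closed form.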

As in [6], the main purpose of this paper is to study estimators of the Greek $\beta^0$ 
whenever the payoff function lacks regularity and the density $f$ of the underlying is unknown. 
As detailed in the next section, a randomization of the parameter of interest $\lambda$ allows us to 
rewrite the sensitivity $\beta^0$ given by (1.3) as a conditional expectation. Combining a 
non-parametric estimation of this conditional expectation with a truncation argument and a 
kernel estimation of the unknown score function $s$ leads to our estimator $\hat\beta_N$. A slightly 
different form of $\hat\beta_N$, without the useful truncation modification, is presented in [6], where 
it serves as a basis to introduce other estimators through an integration by parts argument. The 
main contribution of this paper is the rather demanding derivation of its 
asymptotic properties, suggested in [6]. The use of a truncated version of the classical kernel 
estimator allows us to weaken the required assumptions on the coefficients. We provide 
the asymptotic mean square error and distribution of the proposed estimator, leading to 
the joint calibration of the different simulation parameters. 

Despite the more general form of $\hat\beta_N$, it surprisingly achieves the same rate of convergence 
as the one introduced in [6]. From a practical perspective, we have to admit that, as 
argued in [6], its numerical implementation is more demanding. Nevertheless, the choice 
of two distinct kernel functions significantly enlarges the class of possible sensitivity 
estimators. From a technical point of view, the asymptotics of the estimator require a 
precise derivation of the properties of a kernel estimator of the score function, which appear 
to be of great interest in themselves. Therefore, this paper offers a new contribution to 
the literature on the combination of several non-parametric estimators, and its particular 
application to the computation of the Greeks is furthermore promising for the development 
of competitive numerical methods for sensitivities. 
The paper is organized as follows. In Section 2, we present in detail the construction of this 
estimator. Its asymptotic properties as well as its practical implementation are discussed 
in Section 3. Finally, for ease of presentation, the proofs are reported in the last section. 

2 Construction of the estimator 

Throughout this paper, we consider a complete probability space $(\Omega, \mathcal{F}, P)$ supporting a 
Brownian motion $W$ valued in $\mathbb{R}^m$. We assume that $\mathcal{F}$ is the $P$-completion of the $\sigma$-algebra 
generated by $W$. Let $Z(\lambda)$ be a given random variable valued in $\mathbb{R}^n$ and parameterized by 
$\lambda \in \mathbb{R}^d$, and let $\phi \in L^\infty(\mathbb{R}^n, \mathbb{R})$ be a payoff function. The purpose of this paper is to construct an 
estimator of $\beta^0$, defined in (1.1) as the sensitivity of $V^\phi$ with respect to $\lambda$ at a given point $\lambda^0$. 

We present in this section the intuition behind the construction of the suggested 
estimator. We first identify the score function $s$ defined in (1.3) as the optimal Greek weight 
in the sense of [7]. Considering the realistic case where the score function is unknown, we 
propose to approximate it through a kernel estimation procedure. Combining Monte Carlo 
simulations with the randomization of the parameter $\lambda$, we are able to construct a 
non-parametric estimator of the score function, leading naturally to the estimation of $\beta^0$. The 
reader interested in the asymptotic properties of the estimator may proceed directly to the 
next section. 

2.1 The score function as the optimal Greek weight 

We assume that the distribution of $Z(\lambda)$ is absolutely continuous with respect to the 
Lebesgue measure, and denote by $f(\lambda, \cdot)$ the associated density. As announced in the 
introduction, under mild smoothness assumptions on the density $f$, we directly compute 
that 

$$ \beta^0 = \mathbb{E}\left[\phi[Z(\lambda^0)]\, s[\lambda^0, Z(\lambda^0)]\right], \quad \text{with } s := \frac{\nabla_\lambda f}{f} = \nabla_\lambda \ln f. \tag{2.4} $$

In the context of the Black Scholes model, Broadie and Glasserman [3] noticed that this 
representation allows one to compute $\beta^0$ by a direct Monte Carlo procedure. It is important to 
notice that the score function $s$ only depends on the distribution of the underlying $Z(\lambda^0)$. 
In a more general framework, Fournie, Lasry, Lebuchoux, Lions and Touzi [7] considered 
the set 

$$ \mathcal{W} := \left\{\pi \in L^2(\Omega, \mathbb{R}^d) \,:\, \nabla_\lambda V^\phi(\lambda^0) = \mathbb{E}\left[\phi(Z(\lambda^0))\,\pi\right] \text{ for all } \phi \in L^\infty(\mathbb{R}^n, \mathbb{R})\right\}. $$

Assuming that $\mathbb{E}\,|s[\lambda^0, Z(\lambda^0)]|^2 < \infty$, we already notice that $s[\lambda^0, Z(\lambda^0)] \in \mathcal{W}$. In [7], the 
authors construct a new characterization of the set $\mathcal{W}$ by means of a Malliavin integration 
by parts argument. After rather tedious computations, this representation sometimes allows one 
to produce alternative Greek weights $\pi$ to the score $s[\lambda^0, Z(\lambda^0)]$. When the density $f$ 
and therefore the score function $s$ of the underlying are unknown, those alternative weights 
appear to be very helpful. 

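Nevertheless, they can unfortunately be obtained only in particular cases, and the following 
argument demonstrates that the estimator based on the score $s[\lambda^0, Z(\lambda^0)]$ has minimal 
variance among the class of Greek weight based estimators. Indeed, from the arbitrariness 
of $\phi \in L^\infty(\mathbb{R}^n, \mathbb{R})$, we rewrite 

$$ \mathcal{W} = \left\{\pi \in L^2(\Omega, \mathbb{R}^d) \,:\, \mathbb{E}[\pi \,|\, Z(\lambda^0)] = s[\lambda^0, Z(\lambda^0)]\right\}. $$

We then deduce that, for any $\pi \in \mathcal{W}$, 

$$
\begin{aligned}
\mathrm{Var}\left[\phi[Z(\lambda^0)]\,\pi\right]
&= \mathbb{E}\left[\phi[Z(\lambda^0)]^2\, \mathbb{E}[\pi\pi' \,|\, Z(\lambda^0)]\right] - \nabla V^\phi(\lambda^0)\nabla V^\phi(\lambda^0)' \\
&\ge \mathbb{E}\left[\phi[Z(\lambda^0)]^2\, \mathbb{E}[\pi \,|\, Z(\lambda^0)]\,\mathbb{E}[\pi \,|\, Z(\lambda^0)]'\right] - \nabla V^\phi(\lambda^0)\nabla V^\phi(\lambda^0)' \\
&= \mathbb{E}\left[\phi[Z(\lambda^0)]^2\, s[\lambda^0, Z(\lambda^0)]\, s[\lambda^0, Z(\lambda^0)]'\right] - \nabla V^\phi(\lambda^0)\nabla V^\phi(\lambda^0)' \\
&= \mathrm{Var}\left[\phi[Z(\lambda^0)]\, s[\lambda^0, Z(\lambda^0)]\right],
\end{aligned}
$$

where $'$ denotes the transposition operator and the inequality is understood in the sense of symmetric matrices. Hence 

$$ s[\lambda^0, Z(\lambda^0)] \in \mathcal{W} \ \text{ is a minimizer of } \ \mathrm{Var}\left[\phi[Z(\lambda^0)]\,\pi\right], \quad \pi \in \mathcal{W}. $$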
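
The inequality used in the variance comparison above is the positivity of a conditional covariance matrix. The following short justification is added for the reader's convenience and only relies on the definition of the conditional covariance:

```latex
\mathbb{E}\big[\pi\pi' \,\big|\, Z(\lambda^0)\big]
 - \mathbb{E}\big[\pi \,\big|\, Z(\lambda^0)\big]\,\mathbb{E}\big[\pi \,\big|\, Z(\lambda^0)\big]'
 = \mathbb{E}\Big[\big(\pi - \mathbb{E}[\pi \,|\, Z(\lambda^0)]\big)
                  \big(\pi - \mathbb{E}[\pi \,|\, Z(\lambda^0)]\big)' \,\Big|\, Z(\lambda^0)\Big]
 \;\succeq\; 0 .
```

Multiplying by the non-negative scalar $\phi[Z(\lambda^0)]^2$ and taking expectations preserves this ordering, which gives the second line of the comparison.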

As in [6], we intend in this paper to construct a non-parametric estimator based on the 
approximation of the optimal Greek weight given by the unknown score $s[\lambda^0, Z(\lambda^0)]$. 

2.2 Randomization of the parameter 

In order to be able to estimate the unknown score function $s$, the idea is to create an 
artificial density around the parameter $\lambda^0$, onto which we can transfer the differentiation operation. 
This well known technique in the non-parametric statistics literature, see e.g. [1], is based 
on the randomization of the parameter of interest $\lambda$. One may for example interpret the 
classical finite difference operator (1.2) as a particular case of a randomizing distribution of 
$\lambda$ with two Dirac masses at points $\lambda^0$ and $\lambda^0 + \varepsilon$. 

We then introduce $\ell : \mathbb{R}^d \to \mathbb{R}$, some given probability density function with support 
containing the origin in its interior, and set 

$$ \varphi(\lambda, z) := \ell(\lambda^0 - \lambda)\, f(\lambda, z) \quad \text{for } \lambda \in \mathbb{R}^d \text{ and } z \in \mathbb{R}^n. $$

Considering a couple of random variables $(\Lambda, Z)$ with density $\varphi$, we therefore rewrite $\beta^0$ as 

$$ \beta^0 = \mathbb{E}\left[\phi(Z)\, s(\Lambda, Z) \,\big|\, \Lambda = \lambda^0\right]. \tag{2.5} $$

Although we restrict to the case where the density $f$ of the underlying $Z(\lambda)$ is unknown, we 
still consider that we can simulate $Z(\lambda)$. This is not a limitation in practice since $Z(\lambda)$ is 
typically characterized by a stochastic differential equation, which can be discretized by classical schemes. 
Hence, we introduce a sequence 

$$ (\Lambda_i, Z_i)_{1 \le i \le N} \ \text{ of } N \text{ independent r.v. with distribution } \varphi, \tag{2.6} $$

so that, for any $i \le N$, $\ell(\lambda^0 - \cdot)$ is the density of $\Lambda_i$ and $f(\Lambda_i, \cdot)$ is the conditional density 
of $Z_i$ given $\Lambda_i$. 

We now introduce a kernel function $K : \mathbb{R}^d \to \mathbb{R}$, i.e. such that $\int_{\mathbb{R}^d} K = 1$. Given the $N$ 
observations $(\Lambda_i, Z_i)_{1 \le i \le N}$, the conditional expectation given by (2.5) can be approximated 
by the classical kernel estimator 

$$ \frac{1}{\ell(0)\, N h^d} \sum_{i=1}^{N} \phi(Z_i)\, s(\Lambda_i, Z_i)\, K\!\left(\frac{\lambda^0 - \Lambda_i}{h}\right), \tag{2.7} $$

where the bandwidth $h > 0$ of the estimator is a small parameter. 

This estimator is of course not implementable since the score function $s$ is unknown. 
Nevertheless, as detailed in the next paragraph, the extra regular source of randomness introduced 
by $\ell$ allows us to approximate $s$ and leads to a computable estimator of $\beta^0$. 
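
The randomization step can be illustrated on a toy model where the score happens to be known, so that the kernel estimator of the conditional expectation (2.5) can be evaluated and checked. The sketch below is only indicative: the model $Z \,|\, \Lambda \sim N(\Lambda, 1)$, the Gaussian randomizing density $\ell$, the Epanechnikov kernel, the tent payoff and all function names are assumptions made for the example, with $d = n = 1$.

```python
import math
import random


def tent_payoff(z):
    """Continuous, compactly supported payoff max(1 - |z|, 0)."""
    return max(1.0 - abs(z), 0.0)


def epanechnikov(u):
    """Second-order kernel supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def oracle_kernel_sensitivity(lam0, h, n_sims, seed=0):
    """Kernel estimator of E[payoff(Z) s(Lambda, Z) | Lambda = lam0], score known."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        lam = lam0 - rng.gauss(0.0, 1.0)   # Lambda has density l(lam0 - .), l = N(0, 1) pdf
        z = lam + rng.gauss(0.0, 1.0)      # Z given Lambda is N(Lambda, 1)
        score = z - lam                    # closed-form Gaussian score
        total += tent_payoff(z) * score * epanechnikov((lam0 - lam) / h)
    return total / (normal_pdf(0.0) * n_sims * h)


lam0 = 0.5
# Closed form in the toy model: beta0 = E[payoff'(lam0 + W)] for the tent payoff.
exact = 2.0 * normal_cdf(-lam0) - normal_cdf(-1.0 - lam0) - normal_cdf(1.0 - lam0)
estimate = oracle_kernel_sensitivity(lam0, h=0.2, n_sims=50_000)
print(f"kernel estimate: {estimate:.4f}, exact: {exact:.4f}")
```

Only the simulations with $\Lambda_i$ within distance $h$ of $\lambda^0$ contribute to the sum, which is the price paid for localizing around $\lambda^0$.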

2.3 The double kernel based estimator 

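In order to approximate the score function $s$, we shall first estimate the unknown density 
$\varphi$ of $(\Lambda, Z)$. For this purpose, we introduce a second kernel function $H : \mathbb{R}^n \to \mathbb{R}$. Given 
the $N - 1$ observations $(\Lambda_j, Z_j)_{1 \le j \le N,\, j \ne i}$, we denote by $\hat\varphi^{-i}$ the classical non-parametric estimator 
of the density $\varphi$ given by 

$$ \hat\varphi^{-i}(\lambda, z) := \frac{1}{(N-1)\, h^{d+n}} \sum_{j \ne i} K\!\left(\frac{\lambda - \Lambda_j}{h}\right) H\!\left(\frac{z - Z_j}{h}\right). \tag{2.8} $$

We denote by $\hat\varphi^{-i}_\lambda(\lambda, z)$ the derivative of this estimator with respect to $\lambda$, and we deduce 

$$ \hat\varphi^{-i}_\lambda(\lambda, z) = \frac{1}{(N-1)\, h^{d+n+1}} \sum_{j \ne i} \nabla K\!\left(\frac{\lambda - \Lambda_j}{h}\right) H\!\left(\frac{z - Z_j}{h}\right). $$

Observe now that $s$ and $\varphi$ are closely related since we easily compute 

$$ s(\lambda, z) = \frac{\nabla_\lambda f}{f}(\lambda, z) = \frac{\nabla_\lambda \varphi}{\varphi}(\lambda, z) + \frac{\nabla \ell}{\ell}(\lambda^0 - \lambda), \quad \text{for } \lambda \in \mathbb{R}^d \text{ and } z \in \mathbb{R}^n. $$

Given the observations $(\Lambda_j, Z_j)_{1 \le j \le N,\, j \ne i}$, this naturally leads to the following estimator 
$\hat s^{-i}$ of the score function $s$ given by 

$$ \hat s^{-i}(\lambda, z) := \frac{\hat\varphi^{-i}_\lambda}{\hat\varphi^{-i,\delta}}(\lambda, z) + \frac{\nabla \ell}{\ell}(\lambda^0 - \lambda), \quad \text{where } \hat\varphi^{-i,\delta} := \hat\varphi^{-i} + \left(\delta/3 - \hat\varphi^{-i}\right) \mathbf{1}_{|\hat\varphi^{-i}| < \delta/3}, \tag{2.9} $$

with $\delta$ some small fixed parameter ensuring that the estimator $\hat\varphi^{-i,\delta}$ stays away from zero. 
This technical truncation simply prevents the estimator from exploding, and the 
convergence of the estimator requires some control on the small values of the true 
density $\varphi$, detailed in Assumption S below. 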

In order to construct an estimator of $\beta^0$, we now replace in (2.7) each score $s(\Lambda_i, Z_i)$ by the 
approximation $\hat s^{-i}(\Lambda_i, Z_i)$ based on the $N - 1$ remaining observations. Our estimator is thus 
defined by 

$$ \hat\beta_N := \frac{1}{\ell(0)\, N h^d} \sum_{i=1}^{N} \phi(Z_i)\, \hat s^{-i}(\Lambda_i, Z_i)\, K\!\left(\frac{\lambda^0 - \Lambda_i}{h}\right). \tag{2.10} $$

Based on this type of representation, Elie, Fermanian and Touzi [6] introduce two other 
estimators by means of an integration by parts argument. Even if the representations proposed 
in [6] appear simpler, we surprisingly show in the next section that our estimator 
(2.10) achieves a similar rate of convergence, under slightly more stringent conditions. Even if 
the practical implementation and computation of $\hat\beta_N$ is more time consuming, the general 
form of (2.10) offers a large class of possible estimators, related to different kernel functions 
$K$ and $H$. Since the rate of convergence of these estimators is similar, we believe 
that this result is very encouraging for the development of new types of sensitivity 
estimators. Moreover, the technical proof of the convergence of the estimator appears to 
be of great interest in itself. 
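
The construction of this section can be sketched in a few lines for $d = n = 1$. The code below is an illustrative reading of the double kernel estimator and not a reference implementation: the choice $K = H$ (Epanechnikov), the Gaussian randomizing density $\ell$, the toy model $Z \,|\, \Lambda \sim N(\Lambda, 1)$, the tent payoff and the values of $h$ and $\delta$ are all assumptions made for the example.

```python
import math
import random


def tent_payoff(z):
    """Continuous, compactly supported payoff max(1 - |z|, 0)."""
    return max(1.0 - abs(z), 0.0)


def epanechnikov(u):
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def epanechnikov_grad(u):
    """Derivative of the Epanechnikov kernel inside its support."""
    return -1.5 * u if abs(u) <= 1.0 else 0.0


def truncate(value, delta):
    """Replace values with |value| < delta/3 by delta/3 (truncated density estimate)."""
    return delta / 3.0 if abs(value) < delta / 3.0 else value


def double_kernel_sensitivity(lam0, sample, h, delta, sigma_l=1.0):
    """Leave-one-out double kernel estimator for d = n = 1, K = H = Epanechnikov,
    with a centred Gaussian randomizing density l of standard deviation sigma_l."""
    n = len(sample)
    l_at_zero = 1.0 / (sigma_l * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i, (lam_i, z_i) in enumerate(sample):
        k_outer = epanechnikov((lam0 - lam_i) / h)
        if k_outer == 0.0:
            continue  # observation i is outside the outer kernel window
        density = 0.0
        density_grad = 0.0
        for j, (lam_j, z_j) in enumerate(sample):
            if j == i:
                continue  # leave observation i out
            h_val = epanechnikov((z_i - z_j) / h)
            if h_val == 0.0:
                continue
            u = (lam_i - lam_j) / h
            density += epanechnikov(u) * h_val
            density_grad += epanechnikov_grad(u) * h_val
        density /= (n - 1) * h ** 2
        density_grad /= (n - 1) * h ** 3
        # For a centred Gaussian l, (grad l / l)(x) = -x / sigma_l^2.
        score_hat = density_grad / truncate(density, delta) - (lam0 - lam_i) / sigma_l ** 2
        total += tent_payoff(z_i) * score_hat * k_outer
    return total / (l_at_zero * n * h)


def simulate_sample(lam0, n_sims, seed=0, sigma_l=1.0):
    """Draw (Lambda_i, Z_i) with Lambda_i ~ l(lam0 - .) and Z_i | Lambda_i ~ N(Lambda_i, 1)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n_sims):
        lam = lam0 - rng.gauss(0.0, sigma_l)
        sample.append((lam, lam + rng.gauss(0.0, 1.0)))
    return sample


sample = simulate_sample(lam0=0.5, n_sims=1000)
estimate = double_kernel_sensitivity(0.5, sample, h=0.5, delta=0.03)
print(f"double kernel estimate: {estimate:.4f}")
```

The double loop makes the cost quadratic in the number of simulations, which reflects the heavier implementation mentioned above; with so few simulations the output is noisy and should not be read as an accuracy benchmark.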

3 Asymptotic properties 

This section presents the main results of the paper. We first provide the asymptotic 
properties of the estimator $\hat\beta_N$ defined in (2.10). In particular, the derivation of the asymptotic 
mean square error of the estimator leads to the joint optimal choice of the number of 
simulations $N$ and the bandwidth $h$ of the two kernel functions $K$ and $H$. 

3.1 Notations 

Before stating our results, we recall that the order of a kernel function $K : \mathbb{R}^d \to \mathbb{R}$ is 
defined as the smallest non-zero integer $p$ such that there exist some integers $(j_1, \ldots, j_p)$, 
with $j_k \in \{1, \ldots, d\}$, satisfying 

$$ \int l_{j_1} \cdots l_{j_p}\, K(l)\, dl \neq 0. $$

In other words, $\int l_{j_1} \cdots l_{j_r}\, K(l)\, dl = 0$ for $0 < r < p$ and all $j_k \in \{1, \ldots, d\}$. 
Typically, if $K$ is the product of $d$ even univariate kernels, then it is (at least) of order $p = 2$. 
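
The notion of kernel order can be checked numerically in dimension one. The sketch below computes the moments of two standard compactly supported kernels by a midpoint rule and returns the first non-vanishing one; the polynomial fourth-order kernel and the numerical tolerance are choices made for this example.

```python
def epanechnikov(u):
    """Even, non-negative kernel on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def fourth_order_kernel(u):
    """Polynomial kernel on [-1, 1] whose second moment vanishes."""
    return (15.0 / 32.0) * (3.0 - 10.0 * u ** 2 + 7.0 * u ** 4) if abs(u) <= 1.0 else 0.0


def moment(kernel, r, n_grid=20_000):
    """Midpoint-rule approximation of the integral of u^r * kernel(u) over [-1, 1]."""
    step = 2.0 / n_grid
    total = 0.0
    for k in range(n_grid):
        u = -1.0 + (k + 0.5) * step
        total += (u ** r) * kernel(u)
    return total * step


def kernel_order(kernel, max_order=8, tol=1e-6):
    """Smallest r >= 1 such that the moment of order r is numerically non-zero."""
    for r in range(1, max_order + 1):
        if abs(moment(kernel, r)) > tol:
            return r
    return None


print(kernel_order(epanechnikov), kernel_order(fourth_order_kernel))
```

A higher-order kernel necessarily takes negative values, which is one reason why the density estimate has to be truncated away from zero.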
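In the subsequent subsections, the kernel functions $K$ and $H$ will be respectively of order $p$ 
and $q$, and we shall use the notations 

$$ c_K[\psi](\lambda, z) := \sum_{j_1, \ldots, j_p = 1}^{d} \left(\int l_{j_1} \cdots l_{j_p}\, K(l)\, dl\right) \nabla^p_{\lambda_{j_1} \cdots \lambda_{j_p}} \psi(\lambda, z), \tag{3.11} $$

$$ c_H[\psi](\lambda, z) := \sum_{j_1, \ldots, j_q = 1}^{n} \left(\int v_{j_1} \cdots v_{j_q}\, H(v)\, dv\right) \nabla^q_{z_{j_1} \cdots z_{j_q}} \psi(\lambda, z), \tag{3.12} $$

for every smooth function $\psi$ defined on $\mathbb{R}^d \times \mathbb{R}^n$. We shall also denote $A^{\otimes 2} := AA'$ for every 
matrix $A$, and $C$ denotes a constant whose value may change from line to line. 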

3.2 Asymptotic moments and distribution of the estimator 

We shall work under the following three assumptions concerning respectively the kernels $K$ 
and $H$, the payoff function $\phi$ and the unknown density function $f$. 

Assumption KH  $K$ and $H$ are products of univariate compactly supported Lipschitz 
kernels with orders respectively $p$ and $q$, and $\nabla K$ has bounded variation. 

Assumption S  $\phi$ is continuous and has compact support. Moreover, there exists $\delta > 0$ 
such that $\inf\{\varphi(\lambda, z) : (\lambda, z) \in V(\lambda^0) \times C_\phi\} > \delta$, for some neighborhood 
$V(\lambda^0)$ of $\lambda^0$, and some compact subset $C_\phi$ of $\mathbb{R}^n$ with $\mathrm{Supp}(\phi) \subset \mathrm{int}(C_\phi)$. 

Assumption R  For every $\lambda$, the function $\nabla_\lambda f(\lambda, \cdot)$ is $q$ times differentiable, and for 
every integer $j \le q$, the function $\lambda \mapsto \nabla^j_z \nabla_\lambda \varphi(\lambda, z)$ is continuous at $\lambda = \lambda^0$ uniformly 
with respect to $z \in S$, for some subset $S$ such that $\mathrm{Supp}(\phi) \subset \mathrm{int}(S)$. 

For every $z$, the functions $f(\cdot, z)$ and $\ell$ are $p + 1$ times differentiable, and 
for every integer $i \le p + 1$, the function $\lambda \mapsto \nabla^i_\lambda f(\lambda, z)$ is continuous at $\lambda^0$ uniformly with 
respect to $z \in S$, for some subset $S$ such that $\mathrm{Supp}(\phi) \subset \mathrm{int}(S)$. 



Remark 3.1 We have to admit that Assumption S is at first glance rather restrictive on 
the class of possible payoff functions for financial applications. Nevertheless, we observe that 
most of the classical ones can be included. In particular, the call option can be considered 
here even if its payoff does not have compact support. One just needs to approximate the 
Greeks of the corresponding put option and use the correspondence provided by the 
call-put parity relation satisfied in any arbitrage free market. 

We first present the asymptotic bias and variance of the estimator. 

Proposition 3.1 Under Assumptions KH, S and R, choose $N$ and $h$ so that 

h — > and ,^ \_ — > as N ^ oo . (3.13) 
Then, the bias and the variance of $\hat\beta_N$ satisfy 



E 



/3_ 



N 



CihP + Coh'i + 



and Var 



N 



Nhd+2 ' 



(3.14) 



where 



C2 
C3 



m 
1 

W) 
1 



{X'^ , z)(p{z) dz 



£{0) J f{X'^,z 

m 



K{l2-li)K{li)VK{h)H^{v)dlidl2 dvdz 
K{l2 - li)VK{li) dh \ dh . 



We now turn to the asymptotic distribution of the estimator. 
Theorem 3.1 (i) Under the conditions of Proposition 3.1, we have 



law 
N^oo 



(ii) // in addition A^/i'^+2+2(pAg) _^ ^^g^ ^^^^ vanishes and 



,0^ law 

N^oo 



The technical proofs of Proposition 3.1 and Theorem 3.1 are reported in Section 4. 



Remark 3.2 Note that the condition $n < (p \wedge q) + 1$ is necessary in order to satisfy both (3.13) 
and the condition of (ii). Thus, for basket derivatives or Bermudan options in finance, it 
is necessary to consider high-order kernels, which is not a limitation in practice. 

3.3 Dependence with respect to the price process dynamics 

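One should typically think of the random variable $Z$ as the terminal value of a price process 
$X^\lambda$, whose dynamics are given by a parametrized stochastic differential equation of the 
form: 

$$ X^\lambda_0 = x(\lambda), \qquad dX^\lambda_u = \mu(u, \lambda, X^\lambda_u)\, du + \sigma(u, \lambda, X^\lambda_u)\, dW_u, \tag{3.15} $$

where $x : \mathbb{R}^d \to \mathbb{R}^n$, $\mu : [0, T] \times \mathbb{R}^d \times \mathbb{R}^n \to \mathbb{R}^n$ and the matrix-valued function $\sigma$ defined on 
$[0, T] \times \mathbb{R}^d \times \mathbb{R}^n$ are deterministic Lipschitz functions. In this case, $Z = X^\lambda_T$ can be simulated easily via any time 
discretization scheme, even if its density $f$ is unknown. 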
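
As an indication of how the sample $(\Lambda_i, Z_i)$ can be produced in this framework, the sketch below combines the randomization of the parameter with an Euler scheme in the scalar case. The coefficients (a geometric Brownian motion in which $\lambda$ plays the role of the volatility), the Gaussian randomizing density and all numerical values are assumptions made for the example; in particular this toy diffusion is not meant to satisfy the ellipticity condition discussed next.

```python
import math
import random


def euler_terminal_value(lam, x0, mu, sigma, horizon, n_steps, rng):
    """Euler scheme for dX = mu(u, lam, X) du + sigma(u, lam, X) dW; returns X_T (scalar case)."""
    dt = horizon / n_steps
    x = x0(lam)
    for k in range(n_steps):
        u = k * dt
        dw = rng.gauss(0.0, math.sqrt(dt))
        x += mu(u, lam, x) * dt + sigma(u, lam, x) * dw
    return x


def simulate_randomized_sample(lam0, n_sims, sigma_l, seed=0):
    """Draw (Lambda_i, Z_i): Lambda_i = lam0 - U_i with U_i ~ N(0, sigma_l^2), Z_i = X_T."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n_sims):
        lam = lam0 - rng.gauss(0.0, sigma_l)
        z = euler_terminal_value(
            lam,
            x0=lambda l: 1.0,                 # initial value, independent of lambda here
            mu=lambda u, l, x: 0.05 * x,      # constant drift rate (hypothetical value)
            sigma=lambda u, l, x: l * x,      # lambda is the volatility parameter
            horizon=1.0, n_steps=50, rng=rng)
        sample.append((lam, z))
    return sample


sample = simulate_randomized_sample(lam0=0.2, n_sims=5_000, sigma_l=0.02)
mean_terminal = sum(z for _, z in sample) / len(sample)
print(f"average terminal value: {mean_terminal:.4f}")
```

Each simulated path uses its own draw of the parameter, so no additional simulation is needed compared with a plain Monte Carlo pricing run.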

We detail in this paragraph how the regularity of $f$ required in Assumption R can be 
induced from conditions on the coefficients $x$, $\mu$ and $\sigma$. First, the absolute continuity of $X^\lambda_T$ 
is ensured by the classical uniform ellipticity condition: suppose the matrix $\sigma\sigma^T$ is symmetric, 
positive and there exists a constant $c_\sigma > 1$ such that 

$$ \frac{1}{c_\sigma}\, \mathrm{Id} \le \sigma\sigma^T(t, \lambda, x) \le c_\sigma\, \mathrm{Id}, \qquad \forall (t, \lambda, x) \in [0, T] \times \mathbb{R}^d \times \mathbb{R}^n. \tag{3.16} $$

Second, the density $f$ of $X^\lambda_T$ inherits the regularity of the coefficients $x$, $\mu$ and $\sigma$ through the 
properties of the corresponding transition densities. Following the arguments of Theorem 
A.2.2 p. 478 in [2], see also Proposition 5.1 in [9], Assumption R is satisfied whenever (3.16) 
holds, £ is of class C^, x is of class C"*"^, and the coefficients // and a are of class in 
{t,X,x), CP+^ in A as well as 

It is worth noticing that this analysis gives rise to more tractable assumptions for Proposition 
3.1 and Theorem 3.1 in the realistic framework where $Z$ is the terminal value of a price 
process with dynamics of the form (3.15). 

3.4 Optimal choice of N and h 

We investigate in this section the optimal balance between the number of simulations $N$ 
and the bandwidth $h$. As announced in Remark 3.2, we suppose that $n < (p \wedge q) + 1$. Under 
this condition and the assumptions of Proposition 3.1, the asymptotic expression of the bias 
simplifies and the mean square error of the estimator rewrites 

$$ \mathrm{MSE}(\hat\beta_N) := \mathbb{E}\left[|\hat\beta_N - \beta^0|^2\right] \sim \frac{\mathrm{Tr}(\Sigma)}{N h^{d+2}} + \left|C_1 \mathbf{1}_{p \le q} + C_2 \mathbf{1}_{q \le p}\right|^2 h^{2(p \wedge q)}. $$

Minimizing the MSE in $h$, we get the asymptotically optimal bandwidth selector: 

$$ h^* = \left(\frac{(d + 2)\, \mathrm{Tr}(\Sigma)}{2 (p \wedge q) \left|C_1 \mathbf{1}_{p \le q} + C_2 \mathbf{1}_{q \le p}\right|^2 N}\right)^{1/(d + 2(p \wedge q) + 2)}. \tag{3.17} $$

Therefore $h^*$ is of order $N^{-1/(d + 2(p \wedge q) + 2)}$, leading to an MSE of order $N^{-2(p \wedge q)/(d + 2(p \wedge q) + 2)}$. 
Consequently, despite its more complicated form, the double kernel estimator achieves the 
same rate of convergence as the one introduced in [6]. The only constraint is the use of 
kernel functions of sufficiently large order, i.e. satisfying $p \wedge q > n - 1$. Since, given a large 
number of simulations, one should always use a kernel function of high order, this constraint 
is not restrictive in practice. 
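
The bandwidth selection rule can be sketched as follows in the scalar case. The functions below simply encode the two-term asymptotic mean square error (variance term in $1/(N h^{d+2})$ plus squared leading bias in $h^{2(p \wedge q)}$) and its minimizer; the function names and the numerical values of the constants are hypothetical, since in practice the constants have to be estimated.

```python
def leading_bias_constant(p, q, c1, c2):
    """Constant of the dominant bias term: C1 if p <= q, plus C2 if q <= p."""
    return (c1 if p <= q else 0.0) + (c2 if q <= p else 0.0)


def asymptotic_mse(h, n_sims, d, p, q, trace_sigma, c1, c2):
    """Two-term asymptotic MSE: variance term plus squared leading bias term."""
    m = min(p, q)
    bias = leading_bias_constant(p, q, c1, c2)
    return trace_sigma / (n_sims * h ** (d + 2)) + (bias ** 2) * h ** (2 * m)


def optimal_bandwidth(n_sims, d, p, q, trace_sigma, c1, c2):
    """Minimizer in h of the two-term asymptotic MSE above."""
    m = min(p, q)
    bias = leading_bias_constant(p, q, c1, c2)
    ratio = (d + 2) * trace_sigma / (2 * m * bias ** 2 * n_sims)
    return ratio ** (1.0 / (d + 2 * m + 2))


# Hypothetical constants, for illustration only.
params = dict(d=1, p=2, q=2, trace_sigma=1.5, c1=0.4, c2=-0.1)
h_star = optimal_bandwidth(10_000, **params)
print(f"h* = {h_star:.4f}, MSE(h*) = {asymptotic_mse(h_star, 10_000, **params):.6f}")
```

Multiplying the number of simulations by $2^{d + 2(p \wedge q) + 2}$ halves the selected bandwidth, in line with the stated order of $h^*$.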

3.5 Remarks and extensions 

In this section, we regroup some remarks and possible extensions of the method, which 
unfortunately go beyond the scope of the paper. 

Considering a randomizing distribution $\ell$ with radius equal to the bandwidth $h$, we can 
improve the rate of convergence of the estimator. Indeed, the asymptotic variance of the 
estimator then reduces to a term of order $1/(N h^2)$, leading to an MSE of order $N^{-(p \wedge q)/((p \wedge q) + 1)}$. 
Remarkably, the speed of convergence of the estimator does not depend in this case on the 
dimension of the underlying $X$. For a continuous payoff function, the best finite differences 
estimator achieves an MSE of order $N^{-4/5}$, see [4]. Therefore this estimator outperforms 
the finite differences one as soon as $p \wedge q > 4 \vee (n - 1)$. We choose to omit the proof of this 
result, which is technically rather demanding. 

Without doubt, the choice of the randomizing function $\ell$ is crucial for the precision of the 
estimator presented here. In the particular case of a uniform randomizing distribution $\ell$, 
the analytical form of the estimator simplifies and, after tedious asymptotic developments, 
we can see that the optimal choice for the radius of the distribution $\ell$ is the bandwidth $h$ 
of the kernel function $K$, i.e. the particular case discussed above. From an empirical point 
of view, the optimal choice of the randomizing density $\ell$ should be intimately related to the 
choice of the kernel function $K$. A simple example where these two density functions are 
identical can naturally be considered. 

As for the practical calibration of the optimal bandwidth $h$ given by (3.17), we need to 
estimate the constants $C_1$, $C_2$ and $\Sigma$. As for the choice of the bumping parameter of the 
finite differences estimator, they can be approximated by a preliminary Monte Carlo 
procedure with very few simulations. For example, the procedure proposed in [6] can be directly 
adapted to this setting. 

Finally, a generalization of the above estimator could be considered by taking two different 
bandwidths. Intuitively, the bandwidth for the estimation of the score function introduced 
in (2.8) should be smaller than the one considered for the approximation of the conditional 
expectation in (2.7). Indeed, the roles of these two parameters are rather different, 
but this question is left for further research. 

4 Proofs 

This section is dedicated to the proof of Proposition 3.1 and Theorem 3.1 characterizing 
the asymptotic behavior of $\hat\beta_N$. In this section, we shall always work under the assumptions 
of Proposition 3.1. 

4.1 Preliminaries 

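Recall that 

$$ \hat\beta_N = \frac{1}{\ell(0)\, N h^d} \sum_{i=1}^{N} \phi(Z_i)\, \hat s^{-i}(\Lambda_i, Z_i)\, K\!\left(\frac{\lambda^0 - \Lambda_i}{h}\right), \tag{4.18} $$

where 

$$ \hat s^{-i}(\lambda, z) = \frac{\hat\varphi^{-i}_\lambda}{\hat\varphi^{-i,\delta}}(\lambda, z) + \frac{\nabla \ell}{\ell}(\lambda^0 - \lambda), $$

with $\hat\varphi^{-i,\delta} := \hat\varphi^{-i} + (\delta/3 - \hat\varphi^{-i})\, \mathbf{1}_{|\hat\varphi^{-i}| < \delta/3}$ the truncated version of $\hat\varphi^{-i}(\lambda, z)$ defined by 

$$ \hat\varphi^{-i}(\lambda, z) = \frac{1}{(N-1)\, h^{d+n}} \sum_{j \ne i} K\!\left(\frac{\lambda - \Lambda_j}{h}\right) H\!\left(\frac{z - Z_j}{h}\right). $$

For every $\lambda, z$, we set 

$$ \bar\varphi(\lambda, z) := \mathbb{E}\left[\hat\varphi^{-i}(\lambda, z)\right] = \int K(l) H(v)\, \varphi(\lambda - hl, z - hv)\, dl\, dv, $$

and its derivative is given by 

$$ \bar\varphi_\lambda(\lambda, z) = \frac{1}{h} \int \nabla K(l) H(v)\, \varphi(\lambda - hl, z - hv)\, dl\, dv. $$

Arguing as in the proof of Proposition 4.1 in [6], a Taylor expansion combined with a 
classical change of variable leads to 

$$ \bar\varphi(\lambda, z) - \varphi(\lambda, z) = c_K[\varphi](\lambda, z)\, h^p + c_H[\varphi](\lambda, z)\, h^q + o(h^{p \wedge q}). \tag{4.19} $$

Similarly, writing $\varphi_\lambda := \nabla_\lambda \varphi$, we get 

$$ \bar\varphi_\lambda(\lambda, z) - \varphi_\lambda(\lambda, z) = c_K[\varphi_\lambda](\lambda, z)\, h^p + c_H[\varphi_\lambda](\lambda, z)\, h^q + o(h^{p \wedge q}). \tag{4.20} $$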

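Remark 4.1 Since $\phi$ and $K$ have compact support by Assumptions S and KH, it follows that, for 
sufficiently small $h$, the sum in (4.18) is restricted to pairs $(\Lambda_i, Z_i)$ with values in $C_K \times C_\phi$, 
where $C_K \subset V(\lambda^0)$ is defined in Assumption S, and $C_\phi$ is a compact subset of $\mathbb{R}^n$ such that 
$\mathrm{Supp}(\phi) \subset C_\phi$. 

For any function $\psi$ defined on $C_K \times C_\phi$, we set 

$$ \|\psi\|_\infty := \sup_{(\lambda, z) \in C_K \times C_\phi} |\psi(\lambda, z)|, $$

and, in the following, $\|\cdot\|_r$ refers to the $L^r(\Omega)$-norm. 

Remark 4.2 By Assumption R, since $(\lambda, z)$ vary in a compact subset of $\mathbb{R}^d \times \mathbb{R}^n$, the 
remainder terms in (4.19) and (4.20) are uniformly bounded in $(\lambda, z)$. By the same argument, 
we also see that $c_K[\varphi]$, $c_H[\varphi]$, $c_K[\varphi_\lambda]$ and $c_H[\varphi_\lambda]$ are uniformly bounded, so that: 

$$ \|\bar\varphi - \varphi\|_\infty = O(h^{p \wedge q}) \quad \text{and} \quad \|\bar\varphi_\lambda - \varphi_\lambda\|_\infty = O(h^{p \wedge q}). \tag{4.21} $$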

We now study further the tails of the estimators $\hat\varphi^{-i}$ and we obtain the following estimates. 
Lemma 4.1 There exist $\alpha_1$ and $\alpha_2$ such that 

$$ \sup_{i \le N} P\left[|\hat\varphi^{-i} - \bar\varphi|(\lambda, z) > t\right] \le 2 \exp\left(-\frac{N h^{d+n}\, t^2}{\alpha_1 + \alpha_2 t}\right), \qquad (\lambda, z) \in C_K \times C_\phi. \tag{4.22} $$

Furthermore, for any $t > 0$, there exist $C_t > 0$ and $Q > 0$ satisfying 



sup 11^9 
i<N 



" ' - Viloo > t 



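(4.23) 

Finally, for any integer $r \ge 1$, we have 

$$ \left\| \sup_{1 \le i \le N} \|\hat\varphi^{-i} - \bar\varphi\|_\infty \right\|_{2r} = O\left(\frac{\ln(N)}{\sqrt{N h^{d+n}}}\right). \tag{4.24} $$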



Proof. Observe first that there exist $\alpha_1$ and $\alpha_2$ such that, for any $(\lambda, z) \in C_K \times C_\phi$, 
the random variables $K[(\lambda - \Lambda_i)/h]\, H[(z - Z_i)/h]$ are bounded by $3\alpha_2/2$ and, by the usual 
change of variable, their variances are bounded from above by $\alpha_1 h^{d+n}/2$. Therefore (4.22) 
follows directly from the Bernstein inequality. 
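
For completeness, the way the Bernstein inequality produces the exponent in (4.22) can be spelled out as follows. This is a sketch under the simplifying assumptions that the bounds $3\alpha_2/2$ and $\alpha_1 h^{d+n}/2$ hold for the centred variables $X_j$, and with $m$ denoting the number of summands ($m = N - 1$ for the leave-one-out estimator, the difference with $N$ being immaterial for the asymptotics):

```latex
% Bernstein: X_1,...,X_m independent, centred, |X_j| <= M, sum of variances <= v
P\Big[\Big|\sum_{j=1}^{m} X_j\Big| > x\Big] \le 2\exp\Big(-\frac{x^2}{2\,(v + Mx/3)}\Big).
% Take M = 3\alpha_2/2, \; v = m\,\alpha_1 h^{d+n}/2, \; x = m\,h^{d+n}\,t:
\frac{x^2}{2\,(v + Mx/3)}
  = \frac{m^2 h^{2(d+n)} t^2}{m\,\alpha_1 h^{d+n} + m\,\alpha_2 h^{d+n}\, t}
  = \frac{m\, h^{d+n}\, t^2}{\alpha_1 + \alpha_2 t}.
```

The choice $x = m\, h^{d+n} t$ corresponds to the normalization of the density estimator by the number of summands times $h^{d+n}$.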



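We now turn to the proof of the second estimate and first observe that 

$$ P\left[\sup_{i \le N} \|\hat\varphi^{-i} - \bar\varphi\|_\infty > t\right] \le N\, P\left[\|\hat\varphi - \bar\varphi\|_\infty > t\right], \tag{4.25} $$

where, for ease of notation in this proof, we introduce $\hat\varphi := \hat\varphi^{-1}$. Applying the Liebscher 
strategy, see [11], we cover the compact set $C_K \times C_\phi$ by $C_0\, (R_{N,h})^{-d-n}$ balls $B_j := 
B((\lambda_j, z_j), R_{N,h})$, with $C_0$ a constant chosen large enough. On each ball $B_j$, we have 

$$ \sup_{B_j} |\hat\varphi - \bar\varphi| \le |\hat\varphi - \bar\varphi|(\lambda_j, z_j) + \sup_{(\lambda, z) \in B_j} |\hat\varphi(\lambda, z) - \hat\varphi(\lambda_j, z_j)| + \sup_{(\lambda, z) \in B_j} |\bar\varphi(\lambda, z) - \bar\varphi(\lambda_j, z_j)|. \tag{4.26} $$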



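According to Assumption KH, the kernel functions $K$ and $H$ are Lipschitz and compactly 
supported. Therefore, there exists $M > 0$ such that 

$$ \sup_{(\lambda, z) \in B_j} |\hat\varphi(\lambda, z) - \hat\varphi(\lambda_j, z_j)| \le C\, \frac{R_{N,h}}{h}\, \hat\psi(\lambda_j, z_j), $$

where $\hat\psi$ is the classical histogram kernel estimator of the density defined by 

$$ \hat\psi(\lambda, z) := \frac{1}{N h^{d+n}} \sum_{i=1}^{N} \mathbf{1}_{|\Lambda_i - \lambda| \le Mh}\, \mathbf{1}_{|Z_i - z| \le Mh}. $$

Introducing the notation $\bar\psi := \mathbb{E}[\hat\psi]$ and choosing $R_{N,h}$ such that $R_{N,h} = o(h)$, we then 
deduce from (4.26) that 

$$ \sup_{B_j} |\hat\varphi - \bar\varphi| \le |\hat\varphi - \bar\varphi|(\lambda_j, z_j) + |\hat\psi - \bar\psi|(\lambda_j, z_j) + 2C\, \frac{R_{N,h}}{h}\, \bar\psi(\lambda_j, z_j). $$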

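Summing up over all the balls $B_j$, we get 

$$ P\left[\|\hat\varphi - \bar\varphi\|_\infty > t\right] \le C_0 R_{N,h}^{-(d+n)} \left( P\left[|\hat\varphi - \bar\varphi|(\lambda_j, z_j) > t/3\right] + P\left[|\hat\psi - \bar\psi|(\lambda_j, z_j) > t/3\right] \right) + C_0 R_{N,h}^{-(d+n)}\, P\left[2C h^{-1} R_{N,h}\, \bar\psi(\lambda_j, z_j) > t/3\right]. $$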
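Therefore, applying estimate (4.22) to both kernel estimators $\hat\varphi$ and $\hat\psi$, we deduce the 
existence of $\gamma_1$ and $\gamma_2$ satisfying 

$$ P\left[\|\hat\varphi - \bar\varphi\|_\infty > t\right] \le C R_{N,h}^{-(d+n)} \left( \exp\left(-\frac{N h^{d+n}\, t^2}{\gamma_1 + \gamma_2 t}\right) + \mathbf{1}_{\{2C \frac{R_{N,h}}{h} \bar\psi(\lambda_j, z_j) > t/3\}} \right). \tag{4.27} $$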



But $\bar\psi$ is bounded, so that for any given $t$ the last term on the right hand side equals $0$ for 
$h$ small enough. Since $N h^{d+n} \to \infty$ according to (3.13), choosing $R_{N,h} = h^2$, we deduce 
(4.23) from (4.27). 

We now turn to the moment inequalities and introduce the notation 

$$ Y_N := \frac{\sqrt{N h^{d+n}}}{\ln(N)}\, \sup_{i \le N} \|\hat\varphi^{-i} - \bar\varphi\|_\infty, $$

so that we simply need to prove $\sup_N \|Y_N\|_{2r} < \infty$ for all integers $r \ge 1$. Fix $r \in \mathbb{N}^*$ and observe 
that 

$$ \mathbb{E}\left[Y_N^{2r}\right] = \int_0^\infty 2r\, s^{2r-1}\, P[Y_N > s]\, ds \le C_a + \int_a^\infty 2r\, s^{2r-1}\, P[Y_N > s]\, ds, \tag{4.28} $$

for any $a > 0$. We now fix $s$ large enough and take $R_{N,h} = h \ln(N) / \sqrt{N h^{d+n}}$ in (4.27) and 
(4.25), so that we get, for $N$ large enough, the existence of $\delta_1$ and $\delta_2$ satisfying 



s In(Ar)^ 



F[Yn > s] < CN 



\ hln{N) 



6i+62sln{N)/\/ Nh<i+" 



bmce \n{N)/VN¥+^ ^ and ^ ^ 0, we deduce that for N large enough, we have 



s In(JV)^ 



Plugging this estimate into (4.28) completes the proof. 



□ 



Since $\nabla K$ has bounded variation, the exact same reasoning applies to the estimators $\hat\varphi^{-i}_\lambda$ 
and we similarly derive 



sup \\px -9?a| 

l<i<Af 



o 



( In TV 



2r 



(4.29) 



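The estimates of the previous lemma also allow us to control the error due to the truncation 
of $\hat\varphi^{-i}$. Indeed, since the function $\varphi$ admits $\delta$ as a lower bound according to Assumption S, 
it follows from (4.21) that $\bar\varphi > 2\delta/3$ for $h$ small enough, and (4.22) leads to 

$$ P\left[|\hat\varphi^{-i}(\lambda, z)| < \delta/3\right] \le P\left[|\hat\varphi^{-i} - \bar\varphi|(\lambda, z) > \delta/3\right] \le 2\, e^{-C N h^{d+n}}. \tag{4.30} $$

Introducing $\bar\varphi^\delta := \mathbb{E}\left[\hat\varphi^{-i,\delta}\right]$, we derive 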



if -if 



< { sup F[\r'\iX,z)<6/3] < ^e-^^^'^" 
CkxC^ 



(4.31) 
and combining (3.13) and (4.21), we deduce 
Similarly, applying (4.23), we get 



O {hP^i) 



sup 

l<j<Af 



< (51 



2r 



sup \\(p ' - v^lloo > S/3 



i<N 



Observe also that (4.31) and (4.33) combined with (3.13) allow us to derive 

/ hiiV 



sup 

l<i<N 



--i.S -5 
if ' -if 



o 



2r 



Finally, since $(\lambda, z)$ vary in a compact subset, Assumptions R and S imply that 

llV'lloo + IIv^aIIoo + IIV</'lloo < 

It then follows from equations (4.21), (4.32) and the truncation procedure that 



+ IIv'"a|Ioo + IIV<^IIoo+ V<^' 



+ sup 

\<i<N 



(4.32) 



(4.33) 



for any r G N* . (4.34) 



(4.35) 



< oo. (4.36) 



4.2 A suitable decomposition 

For any $N \in \mathbb{N}$ and $i \le N$, we define the following functions $t^1_{i,N}, \ldots, t^9_{i,N}$ on $\mathbb{R}^d \times \mathbb{R}^n \times \Omega$: 

,1 _ .2 _ 'fx - fx ,3 _ (y - f^) fx 4 _ {f-f^){fXf-f^fx) 



f 

,5 ._ fX~' - fx ,6 



if- 

{^'-r'^')^-X ,7 ._ ifX-' - fx) if' - f') 



f 



((^<5)2 ' 'i^N ■ 



^5 r.,-i,S\2, 



and tj jy : = 



0-i,5 (^ipSj2 



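so that $\hat s^{-i}(\Lambda_i, Z_i) = \sum_{j=1}^{9} t^j_{i,N}(\Lambda_i, Z_i)$. 

This implies the following decomposition of the estimator $\hat\beta_N$: 

$$ \hat\beta_N = \sum_{j=1}^{9} T^j_N, \quad \text{where } T^j_N := \frac{1}{\ell(0)\, N h^d} \sum_{i=1}^{N} \phi(Z_i)\, t^j_{i,N}(\Lambda_i, Z_i)\, K\!\left(\frac{\lambda^0 - \Lambda_i}{h}\right), $$

for every $j = 1, \ldots, 9$. By (4.35) and (4.36), we observe that 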



< OO , for all j = 1, . . . ,4. 



(4.37) 
Lemma 4.2 For any $j = 1, \ldots, 4$, we have $\mathbb{E}\left[T^j_N\right] = O\left(\|t^j_{1,N}\|_\infty\right)$. 

Proof. The result is derived from the following inequality: 



< 



< 



1 



^(0) 
1 



E 



ct>{Zi)t{j^{Ai,Zi) K 



AO-Ai 



^(0) 



{z)t{^{X° -hl,z) K{l)dldv 



< C\\t 



1 AtMoo • 



□ 



Lemma 4.3 For every $j = 1, \ldots, 4$, $\mathrm{Var}(T^j_N) = O\left(N^{-1} h^{-d}\, \|t^j_{1,N}\|_\infty^2\right)$. 

Proof. For any $j = 1, \ldots, 4$, the random variables $t^j_{i,N}(\Lambda_i, Z_i)$, $i \le N$, are independent and 



Var[r^ 



Ni 



< 



< 



1 

1 

£(0)2 Nh^^ 
W IP 



Var 

E 



(t>{Zi)t{j,{ki,Zi) K 



h\Z,)t{^iA,,ZrfK''^' 



h 



02 (z) K'^{l)dldv. 



□ 

The analysis of $T^j_N$, for $j > 4$, requires more effort because of the dependence between the 
random variables $t^j_{i,N}(\Lambda_i, Z_i)$. 

Lemma 4.4 $\mathbb{E}[T^5_N] = 0$ and $\mathrm{Var}(T^5_N) \sim \Sigma / (N h^{d+2})$, where $\Sigma$ is defined in Proposition 3.1. 



Proof. We introduce for any i = 1, . . . , N and j = 1, . . . , : 



fi^i, Zi) 



h 



h 



h 



so that can be re-written in 



-2d-n-l 



£(0) A^(A^ 



By definition, for any i = 1, . . . , A^ and j = 1, . . . , A^ with i / j, we have 

1 



E 



h 



h 



Therefore, E[7ij] = whenever i ^ j, leading to E[r^] = 0. 
Since the $T_{ij}$ are not independent, the computation of the variance requires decomposing 



T% into 



where 



T% = T^/+T^/, (4.38) 



i, — 2d—n—l 
i. — 2d—n~l 



and $b(\lambda, z) := \mathbb{E}[\tau_{12} \mid \Lambda_2 = \lambda, Z_2 = z]$.



1. Let us first study the term $T^{5,1}_N$.



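Setting $\bar\tau_{ij} := \tau_{ij} + \tau_{ji} - b(\Lambda_i, Z_i) - b(\Lambda_j, Z_j)$, we derive the key property:

$\mathbb{E}[\bar\tau_{ij} \mid \Lambda_i, Z_i] = \mathbb{E}[\bar\tau_{ij} \mid \Lambda_j, Z_j] = 0$. (4.39)

Therefore $T^{5,1}_N$ has zero mean and we derive: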

= £(o)2iv^(F^|5^[^^^-^^^-] = 2Wiv(iv-i) ^[^^^^^^^]- 

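By (4.39), we compute:

$\mathbb{E}[\bar\tau_{12}\bar\tau_{12}'] = 2\,\mathbb{E}[\tau_{12}\tau_{12}'] + 2\,\mathbb{E}[\tau_{12}\tau_{21}'] - 2\,\mathbb{E}[bb'(\Lambda_1, Z_1)]$.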
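The centring step behind (4.39) can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the observations live on a finite state space with a uniform law, the pair function `tau` is scalar and arbitrary, and it is centred in its second argument so that its conditional mean given the first observation vanishes (the property that makes (4.39) hold with $b$ defined as a conditional mean given the second observation). The names `degenerate_part`, `tau` and `probs` are hypothetical.

```python
import numpy as np

def degenerate_part(tau, probs):
    """Centre a pair function on a finite state space.

    tau[a, c] stands for tau(x_a, x_c); probs is the law of one observation.
    Returns b with b[c] = E[tau(X, x_c)] and
    tau_bar[a, c] = tau[a, c] + tau[c, a] - b[a] - b[c].
    """
    b = probs @ tau                                   # average over the first argument
    tau_bar = tau + tau.T - b[:, None] - b[None, :]
    return b, tau_bar

rng = np.random.default_rng(0)
probs = np.full(6, 1.0 / 6.0)
raw = rng.normal(size=(6, 6))
# Impose E[tau(x, Y)] = 0 for every x (toy analogue of a zero conditional mean).
tau = raw - (raw @ probs)[:, None]

b, tau_bar = degenerate_part(tau, probs)

# Conditional mean of tau_bar given one argument; tau_bar is symmetric,
# so the same vector serves for both conditionings.
cond_mean = tau_bar @ probs

# Scalar version of the second-moment identity that follows from the centring.
lhs = probs @ (tau_bar**2) @ probs
rhs = (2.0 * probs @ (tau**2) @ probs
       + 2.0 * probs @ (tau * tau.T) @ probs
       - 2.0 * probs @ (b**2))
print(np.max(np.abs(cond_mean)), lhs - rhs)
```

Both printed quantities vanish up to floating-point rounding, which is the finite-state counterpart of the degeneracy used to control the variance of the centred double sum.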
We next estimate that $|\mathbb{E}[\tau_{12}\tau_{12}']|$ is dominated by

E 



a2 



(/?2(Ai,Zi) v^y v^y 

J (f{X^-hli,z) 

by the usual change of variables. Clearly, the first term on the right hand-side is of order 
0(/i2'^+"), while the second one is a ©(/i^'^+^^+S) by KMh . Similarly, we have E[Ti2T^^] = 
Q(j^2d+ny Moreover, E[h'^{Ai, Zi)] = 0{N^^h-'^-^). We deduce that 



Var(4'') = O [j;^2j^ 1 = o ( ) , (4.40) 



/ 1 \ / 1 



using the relations between $N$ and $h$ given by (3.13).



2. We next rewrite $T^{5,2}_N$ as



-2d-n-l 
By the usual change of variables,

\0 



b{X, z) = J c^{z + hv) K (^^-^ - VK{l)H{v) dl dv 

/ ^{z) ipxiX^ -hl,z)K{l)dl. 



By direct calculation, it is easily checked that the second term is negligible. Then, by the 
usual change of variables, it follows that 

E[b{Ai,Zi)biAi,Ziy] 

~ /i3rf+2n y |y ^ hv)K{l2 - h)VK{h)H{v) dh dv^ V9(A° - hh, z) dh dz . 

By Assumptions S and R, we deduce from the dominated convergence theorem together
with the fact that $\mathbb{E}[b(\Lambda_1, Z_1)] = 0$ that

- j '^'(^){/ K{l2-h)'^K{h)dhY ^(>^''^^)dhdz. (4.41) 

The proof is completed by collecting the estimates (4.40) and (4.41) into (4.38). □
Lemma 4.5 E[r^] = o(/iP^«) andYa.r{T%) = o{N^^h~'^^^). 
Proof. We decompose ^ into the sum of 

,6,1 ._ ((^ - ^-') (fx 6,2 ._ ((^-^ - (p-'''^) ipx 6,3 

■^r • — / _ ic X o ) ''i AT ■ — / _ i N o ana t 



i,N ■ (^5)2 ' %N ■ (-5)2 ''''^ \N ■ (^5)2 ' 

and study the corresponding $T^{6,1}_N$, $T^{6,2}_N$ and $T^{6,3}_N$ separately.
1. It can be checked easily that $T^{6,1}_N$ can be dealt with as $T^5_N$. By the same calculation, we

u-Ad—2n 



get $\mathbb{E}[T^{6,1}_N] = 0$ and

j^-Ad—2n 



where $b(\lambda, z)$ is given by:



E 



(Zi)99A(A„Zi) l^^fKi-X\ ^(Z,-z\ 



99(Ai,Zj)2 y h J \ \ h J \ ^ 



The variables $b(\Lambda_i, Z_i)$ also have zero mean and, as in the proof of Lemma 4.4, the usual
change of variables implies that



Var(6(A„Zi)) ~ J [Geih, z)f ^piX^ - hh, z) dh dz , 

with Geih, z) := / (f)(z + hv) — (X° + hh - hh, z + hv)K(l2 - h)K{h)H{v) dh dv. 
J 
By the continuity and the uniform boundedness of $\phi$ and $\varphi_\lambda/\varphi$ implied by Assumptions S
and R, we derive



1 



1 



Ar/id+2 



6 2 

2. We now turn to $T^{6,2}_N$ and compute



< C sup 

i<N 



1 ^ 

7^ ^ 



^ / AO A . 



Therefore, we deduce from the Cauchy-Schwarz inequality that



E 



'TV 



< C 



sup 



E 



TV 

E 



A" -A, 



1/2 



and (IXT3]) combined with (14:331) lead to E 



'at 



o{hP^''). Similarly, we get 



VariT^^) < C 



sup 

i<TV 



E 



1 ^ 



./A°-A, 



1/4 



which leads to $\mathrm{Var}(T^{6,2}_N) = o(N^{-1}h^{-d-2})$.

3. We finally observe that $T^{6,3}_N$ is treated similarly thanks to (4.31). □

Lemma 4.6 $\mathbb{E}[T^7_N] = 0$ and $\mathrm{Var}(T^7_N) = o(N^{-1}h^{-d-2})$.

Proof. Observe that



t^(A,z) = t^(A, z)'0(A, z) where V ■= 



Following the lines of the proof of Lemma 4.4, we see that $\mathbb{E}[T^7_N] = 0$, and we estimate



A^/i'^+Var(r^) 



[G'j{u,z)\ p{\ —hu,z)dudz, 

with Gjiu, z) := / (j){z + hv)i){\^ + hi - hu, z + hv)K{u - l)VK{l)H{v) dl dv . 

, = O^h^^'^) and, since if and (p are uniformly 



By (14:321) and (14361) it follows that 
bounded, we deduce that 

Var(r^) = O 



1 



h 



— n— 1 



Lemma 4.7 E [T%] 
andYaT{T%) = o{N-^h-'^-^) . 



H^^ j K{h - l2)K{h)VK{l2)dhdh 



□ 
Proof. We split the proof in two steps.

1. We first estimate $\mathbb{E}[T^8_N]$. We rewrite $t^8_{i,N}(\lambda, z)$ as $t^{8,1}_{i,N}(\lambda, z) + t^{8,2}_{i,N}(\lambda, z) + t^{8,3}_{i,N}(\lambda, z)$ with



8,2 
i,N 

,8,3 



t 



t 



^~i,5 (-(^5)2 



+ 



2 ^,05^2 



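Then $T^8_N = T^{8,1}_N + T^{8,2}_N + T^{8,3}_N$, where

$T^{8,k}_N := \frac{1}{\ell(0)\, N h^d} \sum_{i=1}^{N} \phi(Z_i)\, t^{8,k}_{i,N}(\Lambda_i, Z_i)\, K\left(\frac{\lambda^0 - \Lambda_i}{h}\right)$ for $k = 1, 2, 3$.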



We now introduce 



so that 



A,; -A, 



E 



E 



$\mathbb{E}[U_{ij} V_{ik} \mid \Lambda_i, Z_i] = \mathbb{E}[U_{ij} \mid \Lambda_i, Z_i]\, \mathbb{E}[V_{ik} \mid \Lambda_i, Z_i] = 0$ whenever $j \neq k$.
Using this property, we compute directly that 

^-2d-2n-l 



E 



4\Ai,Zi)|Ai,Zi 



(iV-l)V(Ai,Zi) 

^-2d-2n-l 



E 



(iV-lV(A^^^^) 

Since the expectation of $T^{8,1}_N$ is given by:

h 



E[C/i2l42|Ai,^i] . 



E 



,8,1 



AT 



^(0) 



-E 



4>{Z^)K ( ^^^-^ 1 E 



f«';v(Ai,Zi)|Ai,Zi 



we derive by the usual change of variables. 



r; 



AT 



with ^8(^2, z) := J 



{z + hv) 



Gsih, z)(p{\^ - hl2, z) dl2 dz . 



-K{l2 - li)K{li)VK{h)H^{v) dh dv . 



ip{X° + Ml - hl2,z + hv)' 
Finally, by the continuity and the uniform boundedness of $\phi$ and $\varphi$, we derive:

^—d—n—l 



E 



-N 



i{0)N 



{z)K{l2 - li)K{li)VK{li)H\v) dh dv dh dz . (4.42) 
Furthermore, by the Cauchy-Schwarz inequality and (3.13), we have

N 



E 



8,k 



N 



< 



sup 

i<N 



8,k 
i,N 



E 



1 



E 

i=l 



(Zi)K 



AO -A. 



1/2 



< c 



sup 

i<N 



S,k 
i,N 



$k = 2, 3$.



(4.43) 



(4.44) 



Finally, combining relations (4.21)-(4.36), the Cauchy-Schwarz inequality and (3.13), we get

1 



sup 

i<N 



8,2 



and 



sup 

i<N 



o 



1 



Therefore, (4.42) and (4.43) lead to the expected equivalent for $\mathbb{E}[T^8_N]$.

2. We now study the variance of $T^8_N$. We first notice that the Cauchy-Schwarz inequality and (3.13) lead to

2 



Var[T%] < C 



sup t' 

i<N 



.N\ 



But, using again the Cauchy-Schwarz inequality and relations (3.13), (4.21), (4.36) and (4.34),
we deduce that



Var(rlr) = O 



(^]V2/^2rf+2n^ 



1 



□ 



Lemma 4.8 $\mathbb{E}[T^9_N] = o(N^{-1}h^{-d-n-1})$ and $\mathrm{Var}(T^9_N) = o(N^{-1}h^{-d-2})$.

Proof. It can easily be checked that $T^9_N$ can be dealt with as $T^8_N$ and, following the lines of the
proof of Lemma 4.7, we obtain the announced result.



4.3 Asymptotic bias and variance 

This section is devoted to the proof of Proposition 3.1, characterizing the asymptotic bias
and variance of the double kernel based estimator $\hat\beta_N$.

Proof of Proposition 3.1. We split the proof in two steps.

1. We first derive the expectation of $\hat\beta_N$.

Notice that = (3n as defined in p. 7(1 which satisfies 

E [Pn] = -J— I (j){z)K{l)s{\^ - hi, z)^{\^ - hi, z) dt dz . 
The regularity of the function $s\varphi$ given by Assumption R enables us to derive

UP r 

^[^^] - - 77n^ hKWx] (A°, dz . (4.45) 



£(0) 

Using Remark 4.2, we deduce from (4.20) that we have

Wl] = J^^l i^x] (A°, z)0(z) + i^x] (A°, z)Hz) dz + oiW^'^) 

We now rewrite $t^3_{i,N}$ as the sum of

,3,1 _ ((^ - (p) fx 3.2 _ if^ - f) fx 

and study separately the corresponding $T^{3,1}_N$ and $T^{3,2}_N$. From (4.19), we derive



^[^^ ] = -W)J ^^^^(^°'^)^(^)^^- ^ J ^^^{>^',-)mdz + o{hv-^). 

and we directly deduce from (IXT8D and (11:811 that E[r^'^] = ©(/iP^*). 
Note that 



Then, using gSS]), ((4:35]) and dOHl) . we derive ||ii^^Ar||oo = o{hP^'i) and Lemma gj 

leads to E(r^) = o(/iP^«) . 

From Lemmas 4.4, 4.5 and 4.6, we have E(T^j_N) = for j = 5, ..., 7 and Lemma 4.7 gives

^[^n] ~ —-— ^f^K{l2-li)K{li)VK{h)H^{v)dhdvdl2dz. 

Finally, Lemma 4.8 tells us $\mathbb{E}[T^9_N] = o(N^{-1}h^{-d-n-1})$.

We then obtain $\mathbb{E}[\hat\beta_N]$ by summing up the $\mathbb{E}[T^j_N]$ for $j = 1, \ldots, 9$.

2. We then analyze the variance of P^. For any j = 1, ... ,4, expressions (I4.2ip . (|4.32l) . 
(li:35]l and S^Mh imply \\t%\\oo = 0(1). Then, Lemma lU leads to 

Var(r^) = o{N~^h^'^~^) for every j = 1, . . . , 4 . 

From Lemma 4.4, we get

Var(r5 ) ^ I <P\z)y K{l2-h)VK{h)dhY f{\\z)dzdl2. (4.46) 

In addition, Lemmas 4.5 to 4.8 also imply

$\mathrm{Var}(T^j_N) = o(N^{-1}h^{-d-2})$ for every $j = 6, \ldots, 9$.
Hence, $\mathrm{Cov}(T^j_N, T^k_N) = o(N^{-1}h^{-d-2})$ unless $j = k = 5$, and $\mathrm{Var}(\hat\beta_N)$ is given by expression
(4.46). □



{f-f^)'^fx , if\-fx)if-f^) 
4.4 Central limit theorem

This section is devoted to the proof of Theorem 3.1, which provides a central limit theorem
for the double kernel based estimator $\hat\beta_N$.

Proof of Theorem 3.1. As we saw in the proof of Proposition 3.1, the variance of $\hat\beta_N$
is given by the variance of
is given by the variance of 

where h{\,z) := h'^^'' j (l){z + hv) K (^^^-^ - l^V K{l)H{v) dl dv 
- / (i){z) 99a(A° - hi, z)K{l) dl. 



As in the proofs of Theorems 4.1 or 4.2 in [6], using Kolmogorov's condition with the fourth
moment of $b$ and the Cramér-Wold device, we derive that $T^{5,2}_N$ is asymptotically normal.

We then finally deduce that



Under the additional condition Nh^^'^^'^^^^''^ → 0, we conclude the proof by noting that the
bias vanishes in the previous expression. □
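For reference, the Cramér-Wold device invoked in this proof reduces convergence in distribution of a random vector to that of its one-dimensional projections. In generic notation (the symbols $X_N$, $X$, $u$ and the dimension $m$ are placeholders introduced here, not the paper's), it can be stated as:

```latex
X_N \xrightarrow{\;d\;} X \ \text{ in } \mathbb{R}^m
\quad\Longleftrightarrow\quad
u^\top X_N \xrightarrow{\;d\;} u^\top X
\ \text{ for every } u \in \mathbb{R}^m .
```

In the present setting it would presumably be applied to the suitably normalized vector-valued leading term, each scalar projection being a normalized sum of independent centred variables, which is where a moment condition for a scalar central limit theorem enters.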